Abstract

The problem of converting noisy quantum correlations between two parties into noiseless 
classical ones using a limited amount of one-way classical communication is addressed. A 
single-letter formula for the optimal trade-off between the extracted common randomness and 
classical communication rate is obtained for the special case of classical-quantum correlations. 

The resulting curve is intimately related to the quantum compression with classical side 
information trade-off curve Q*(R) of Hayden, Jozsa and Winter. 

For a general initial state we obtain a similar result, with a single-letter formula, when we 
impose a tensor product restriction on the measurements performed by the sender; without 
this restriction the trade-off is given by the regularization of this function. 

Of particular interest is a quantity we call "distillable common randomness" of a state: 
the maximum overhead of the common randomness over the one-way classical communication 
if the latter is unbounded. It is an operational measure of (total) correlation in a quantum 
state. For classical-quantum correlations it is given by the Holevo mutual information of its
associated ensemble; for pure states it is the entropy of entanglement. In general, it is given
by an optimization problem over measurements and regularization; for the case of separable
states we show that this can be single-letterized.

1 Introduction 

Quantum, and hence also classical, information theory can be viewed as a theory of inter-conversion 
between various resources. These resources can be classical or quantum, static or dynamic, noisy 
or noiseless. Based on the number of spatially separated parties sharing a resource, it can be 
bipartite or multipartite; local (monopartite) resources are typically taken for granted. In what 
follows, we shall mainly be concerned with bipartite resources. Let us introduce a notation in
which c and q stand for classical and quantum, respectively, curly and square brackets stand for
noisy and noiseless, respectively, and arrows (→) distinguish dynamic resources from static
ones. The possible combinations are tabulated below. Noisy dynamic resources are the four types
of noisy channels, classified by the classical/quantum nature of the input/output. Besides the
familiar classical {c → c} and quantum {q → q} channels, this category also includes preparation
of quantum states from a given set (labeled by classical indices) {c → q} and measurement of
quantum states yielding classical outcomes {q → c}. Dynamic "unit" resources by definition
require the input and output to be of the same nature, and they comprise the noiseless bit
[c → c] and qubit [q → q] channels, but we additionally introduce symbols for general (higher
dimensional) perfect quantum and classical channels: (q → q) and (c → c), respectively.

Noisy static resources, not having a directionality, can be one of three types: classical {c c},
quantum {q q} and mixed classical-quantum {c q}. The first of these is embodied in a pair
of correlated random variables XY, associated with the product set $\mathcal{X} \times \mathcal{Y}$ and a probability
distribution $p(x,y) = \Pr\{X = x, Y = y\}$ defined on $\mathcal{X} \times \mathcal{Y}$. The {q q} analogue is a bipartite
quantum system AB, associated with a product Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$ and a density operator
$\rho^{AB}$, the "quantum state" of the system AB, defined on $\mathcal{H}_A \otimes \mathcal{H}_B$. A {c q} resource is a hybrid
classical-quantum system XQ, the state of which is now described by an ensemble $\{\rho_x, p(x)\}$, with
$p(x)$ defined on $\mathcal{X}$ and the $\rho_x$ being density operators on the Hilbert space $\mathcal{H}_Q$ of Q. The state
of the quantum system Q is thus correlated with the classical index X. A useful representation of
{c q} resources, which we refer to as the "enlarged Hilbert space" (EHS) representation, is obtained
by embedding the random variable X in some quantum system A. Then our ensemble $\{\rho_x, p(x)\}$
corresponds to the density operator

$$\rho^{AQ} = \sum_x p(x)\, |x\rangle\langle x|^A \otimes \rho_x^Q, \tag{1}$$

where $\{|x\rangle : x \in \mathcal{X}\}$ is an orthonormal basis for the Hilbert space $\mathcal{H}_A$ of A. Thus {c q} resources
may be viewed as a special case of {q q} ones. Finally, we have noiseless static resources, which can
be classical (c c) or quantum (q q). The classical resource is a pair of perfectly correlated random
variables, which is to say that $\mathcal{X} = \mathcal{Y}$ and $p(x,y) = p(x)\delta(x,y)$ (without loss of generality). We
reserve the [c c] notation for a unit of common randomness (1 rbit), a perfectly correlated pair of
binary random variables with a full bit of entropy. The quantum resource is a quantum system AB
in a pure entangled state $|\psi\rangle^{AB}$. Again, the [q q] notation denotes a unit of entanglement (1 ebit),
a maximally entangled qubit pair $\frac{1}{\sqrt{2}}(|0\rangle^A|0\rangle^B + |1\rangle^A|1\rangle^B)$. Since (c c) and [c c], and (q q) and [q q],
may be inter-converted in an asymptotically lossless way and with an asymptotically vanishing
rate of extra resources, for most purposes it suffices to consider the unit resources only. Note the
clear hierarchy amongst unit resources:

[q → q] ⇒ ([c → c] or [q q]) ⇒ [c c].

Any of the conversions (⇒) can be performed at a unit rate and no additional cost. On the other
hand, [c → c] and [q q] are strictly "orthogonal": neither can be produced from the other.



Dynamic unit resources
  [c → c]    noiseless bit channel
  [q → q]    noiseless qubit channel

Noiseless dynamic resources
  (c → c)    general noiseless classical channel, w.l.o.g. identity on some set
  (q → q)    general noiseless quantum channel, w.l.o.g. identity on some space

Noisy dynamic resources
  {c → c}    noisy classical channel, given by a stochastic matrix W
  {c → q}    quantum state preparation, given by a quantum alphabet {ρ_x}
  {q → c}    generalized measurement, given by a POVM (E_x)
  {q → q}    noisy quantum channel, given by a CPTP map

Unit static resources
  [c c]      maximally correlated bits (1 rbit)
  [q q]      maximally entangled qubits (1 ebit)

Noiseless static resources
  (c c)      perfectly correlated random variables XY with distribution p(x,y) = p(x)δ(x,y)
  (q q)      bipartite quantum system AB in a pure state |ψ⟩^{AB}

Noisy static resources
  {c c}      correlated random variables XY with joint distribution p(x,y)
  {c q}      classical-quantum system XQ corresponding to an ensemble {ρ_x, p(x)}
  {q q}      bipartite quantum system AB in a general quantum state ρ^{AB}



The generality of this classification is illustrated in the table below, where the resource
inter-conversion task is identified for a number of examples from the literature. To interpret these
"chemical reaction formulas", there is but one rule to obey: if non-unit resources appear on the right,
then all non-unit (dynamical) resources are meant to be fed from some fixed source. For example,
(c → c) in the output of a transformation symbolizes the noiseless transmission of an implicit
classical information source, and likewise (q → q) the noiseless transmission of an implicit quantum
information source.



Some known problems in classical and quantum information theory
  [c → c] ⇒ (c → c)               Shannon compression [32]
  [q → q] ⇒ (q → q)               Schumacher compression [30]
  (q q) ⇒ [q q]                   Entanglement concentration [4]
  [q q] + [c ↔ c] ⇒ (q q)         Entanglement dilution [4, 28, 20]
  [q q] + [c ↔ c] ⇒ {q q}         Entanglement cost, entanglement of purification [?]
  {q q} + [c ↔ c] ⇒ [q q]         Entanglement distillation [8, 34]
  {c c} + [c ↔ c] ⇒ [c c]         Classical common randomness capacity [?]
  {c q} + [c → c] ⇒ [c c]         present paper
  {q q} + [c → c] ⇒ [c c]         present paper
  {c → c} ⇒ [c → c]               Shannon's channel coding theorem [32]
  {c → q} ⇒ [c → c]               HSW theorem (fixed alphabet) [24]
  {q → q} ⇒ [c → c]               HSW theorem (fixed channel) [?]
  {q → q} ⇒ [q → q]               Quantum channel coding theorem [34]
  [c → c] + [q q] ⇒ [q → q]       Quantum teleportation [6]
  [q → q] + [q q] ⇒ [c → c]       Quantum super-dense coding [10]
  {q → q} + [q q] ⇒ [c → c]       Entanglement assisted classical capacity [2]
  {q → q} + [q q] ⇒ [q → q]       Entanglement assisted quantum capacity [5]
  [c → c] + [c c] ⇒ {c → c}       Classical reverse Shannon theorem [2]
  [c → c] + [q q] ⇒ {q → q}       Quantum reverse Shannon theorem
  [c → c] + [c c] ⇒ {q → c}       Winter's POVM compression theorem [?]
  [c → c] + [q q] ⇒ {c → q}       Remote state preparation [29, 15]
  [c → c] + [q → q] ⇒ {c → q}     Quantum-classical trade-off in quantum data compression [19]
  {c q} + [c → c] ⇒ (c → c)       Classical compression with quantum side information [17]



The present paper addresses the static "distillation" (noisy ⇒ noiseless) task of converting
noisy quantum correlations {q q}, i.e. bipartite quantum states, into noiseless classical ones [c c],
i.e. common randomness (CR). Many information theoretical problems are motivated by simple
intuitive questions. For instance, Shannon's channel coding theorem [32] quantifies the ability
of a channel to send information. Similarly, our problem stems from the desire to quantify the
classical correlations present in a bipartite quantum state. A recent paper by Henderson and
Vedral [21] poses this very question, and introduces several plausible measures. However, the
ultimate criterion for accepting something as an information measure is whether it appears in the
solution to an appropriate asymptotic information processing task; in other words, whether it has
an operational meaning. It is this operational approach that is pursued here.

The structure of our conversion problem is akin to two other static distillation problems:
{q q} ⇒ [q q] and {c c} ⇒ [c c]. The former goes under the name of "entanglement distillation":
producing maximally entangled qubit states from a large number of copies of $\rho^{AB}$ with
the help of unlimited one-way or two-way classical communication [8]. Allowing free classical
communication in these problems is legitimate since, as already noted, entanglement and classical
communication are orthogonal resources. The {c c} ⇒ [c c] problem is one of creating CR from
general correlated random variables, which is known to be impossible without additional classical
communication. Now allowing free communication is inappropriate, since it could be used to
create unlimited CR. There are at least two scenarios that do make sense, however, and have been
studied by Ahlswede and Csiszár in [?] and [?], respectively. In the first, one makes a distinction
between the distilled key, which is required to be secret, and the classical communication, which is
public. The second scenario involves limiting the amount of classical communication to a one-way
rate of R bits per input state and asking about the maximal CR generated in this way (see [2] for
further generalizations). One can thus think of the classical communication as a quasi-catalyst that
enables distillation of a part of the noisy correlations, while itself becoming CR; it is not a genuine
catalyst because the original dynamic resource is more valuable than the static one. We find that
these classical results generalize rather well to our information processing task. The analogue of
the first scenario [?] has been treated in an unpublished paper by Winter and Wilmink [38]. In
this paper we generalize the second scenario. As a corollary we give one (of possibly many) operationally motivated
answers to the question "How much classical correlation is there in a bipartite quantum state?".

Alice and Bob share n copies (in classical jargon: an n letter word) of a bipartite quantum
state $\rho^{AB}$. Alice is allowed nR bits of classical communication to Bob. The question is: how
much CR can they generate under these conditions? More precisely, Alice is allowed to perform
some measurement on her part of $(\rho^{AB})^{\otimes n}$, producing the outcome random variable $X^{(n)}$ defined
on some set $\mathcal{X}^{(n)}$. Next, she sends Bob $f(X^{(n)})$, where $f : \mathcal{X}^{(n)} \to \{1, 2, \ldots, 2^{nR}\}$. The rate R
signifies the number of bits per letter needed to convey this information. Conditioned on the value
of $f(X^{(n)})$, Bob performs an appropriate measurement with outcome random variable $Y^{(n)}$. We
say that a pair of random variables (K, L), both taking values in some set $\mathcal{K}$, is permissible if

$$K = K(X^{(n)}), \qquad L = L(Y^{(n)}, f(X^{(n)})).$$

A permissible pair (K, L) represents ε-common randomness if

$$\Pr\{K \ne L\} \le \epsilon. \tag{2}$$

In addition we require the technical condition that K and L are in the same set satisfying

$$|\mathcal{K}| \le 2^{c'n} \tag{3}$$

for some constant c'. Thus, strictly speaking, our CR is of the (c c) type, but it can easily be
converted to [c c] CR via local processing (intuitively, we would like to say "Shannon data compression",
only that the randomness thus obtained is not uniformly distributed but "almost uniformly" in
the sense of the AEP [12]). A CR-rate pair (C, R) of common randomness C and classical side
communication R is called achievable if for all ε, δ > 0 and all sufficiently large n there exists a
permissible pair (K, L) satisfying (2) and (3), such that

$$\frac{1}{n} H(K) \ge C - \delta.$$

We define the CR-rate function C(R) to be

$$C(R) = \sup\{C : (C, R) \text{ is achievable}\}.$$

One may also formulate the C(R) problem for Alice and Bob sharing some classical-quantum
resource XQ rather than the fully quantum AB. In this case Alice's measurement is omitted since
she already has the classical random variable $X^{(n)} = X^n$. In the original classical problem [5]
Alice and Bob share the classical resource XY. There Bob's measurement is also omitted, since
he already has the random variable $Y^{(n)} = Y^n$. Finally, we introduce the distillable CR as

$$D(R) = C(R) - R, \tag{4}$$

the amount of CR generated in excess of the invested classical communication rate. This suggests
$D(\infty)$ as a natural asymmetric measure of the total classical correlation in the state. As we shall
see, the above turns out to be equivalent to the asymptotic ("regularized") version of $C_A(\rho^{AB})$, as
defined in [21].

The paper is organized as follows. First we consider the special case of {cq} resources for which 
evaluating C(R) reduces to a single-letter optimization problem. Then we consider the {qq} case 
which builds on it rather like the fixed channel Holevo-Schumacher-Westmoreland (HSW) theorem
builds on the fixed alphabet version.

2 Classical-quantum correlations 

In this section we shall assume that Alice and Bob share n copies of some {c q} resource XQ,
defined by the ensemble $\mathcal{E} = \{\rho_x, p(x)\}$ or, equivalently, equation (1). Alice knows the random
variable X and Bob possesses the d-dimensional quantum system Q. In what follows we shall make
use of the EHS representation to define various information theoretical quantities for
classical-quantum systems. The von Neumann entropy of a quantum system A with density operator $\rho^A$
is defined as $H(A) = -\mathrm{Tr}\, \rho^A \log \rho^A$. For a bipartite quantum system AB define formally the
conditional von Neumann entropy

$$H(B|A) = H(AB) - H(A),$$

and the quantum mutual information (introduced earlier as "correlation entropy" by Stratonovich)

$$I(A;B) = H(A) + H(B) - H(AB) = H(B) - H(B|A).$$

For general states of AB we introduce these quantities without implying an operational meaning for
them. (Though the quantum mutual information appears in the entanglement assisted capacity
of a quantum channel [?], and the negative of the conditional entropy, known as the coherent
information, appears in the quantum channel capacity [8, 34].)

Introducing these quantities in formal analogy has the virtue of allowing us to use the familiar
identities and many of the inequalities known for classical entropy. This seems to us better than
claiming any particular operational connection (which, by all we know about quantum information
today, cannot be unique anyway).

Subadditivity of von Neumann entropy implies $I(A;B) \ge 0$. For a tripartite quantum system
ABC define the quantum conditional mutual information

$$I(A;B|C) = H(A|C) + H(B|C) - H(AB|C) = H(AC) + H(BC) - H(ABC) - H(C).$$

Strong subadditivity of von Neumann entropy implies $I(A;B|C) \ge 0$. A commonly used identity
is the chain rule

$$I(A;BC) = I(A;B) + I(A;C|B).$$

Notice that for classical-quantum correlations (1) the von Neumann entropy H(A) is just the
Shannon entropy H(X) of X. We define the mutual information of a classical-quantum system
XQ as I(X;Q) = I(A;Q). Notice that this is none other than the Holevo information of the ensemble
$\mathcal{E}$,

$$\chi(\mathcal{E}) = H\Big(\sum_x p(x)\rho_x\Big) - \sum_x p(x) H(\rho_x).$$

(Gordon and Levitin wrote down this expression much earlier; see [?] for historical references.
Nevertheless, we feel that the honour should be with Holevo for his proof of the
information bound duly named after him [?].)
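The Holevo information defined above is easy to evaluate numerically for small ensembles. The following is a minimal sketch, assuming NumPy and base-2 logarithms; the function names `von_neumann_entropy` and `holevo_information` are illustrative choices and do not appear in the paper. The two test ensembles (orthogonal states and identical states) are chosen because their values, H(X) and 0, are known exactly.

```python
import numpy as np

def von_neumann_entropy(rho):
    """H(rho) = -Tr rho log2 rho, computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # drop zero eigenvalues (0 log 0 = 0)
    return float(-np.sum(evals * np.log2(evals)))

def holevo_information(probs, states):
    """chi(E) = H(sum_x p(x) rho_x) - sum_x p(x) H(rho_x)."""
    avg = sum(p * rho for p, rho in zip(probs, states))
    avg_entropy = sum(p * von_neumann_entropy(rho) for p, rho in zip(probs, states))
    return von_neumann_entropy(avg) - avg_entropy

proj0 = np.diag([1.0, 0.0])               # |0><0|
proj1 = np.diag([0.0, 1.0])               # |1><1|

# Orthogonal states: Q determines X, so chi equals H(X) = 1 bit.
chi_orthogonal = holevo_information([0.5, 0.5], [proj0, proj1])
# Identical states: Q carries no information about X.
chi_identical = holevo_information([0.5, 0.5], [proj0, proj0])
```

For a pure-state ensemble the second term vanishes, so the Holevo information reduces to the entropy of the average density matrix.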

Using the EHS representation for some tripartite classical-quantum system UXQ, strong
subadditivity [23] gives inequalities such as $I(U;X|Q) \ge 0$ or $I(U;Q|X) \ge 0$, and the chain rule
implies, e.g.,

$$I(U;XQ) = I(U;Q) + I(U;X|Q).$$

We shall take such formulae for granted throughout the paper. 

An important classical concept is that of a Markov chain of random variables $U \to X \to Y$
whose probabilities obey $\Pr\{Y = y | X = x, U = u\} = \Pr\{Y = y | X = x\}$, which is to say that
Y depends on U only through X. Analogously we may define a classical-quantum Markov chain
$U \to X \to Q$ associated with an ensemble $\{\rho_{ux}, p(u,x)\}$ for which $\rho_{ux} = \rho_x$. Such an object
typically comes about by augmenting the system XQ by the random variable U (classically)
correlated with X via a conditional distribution $Q(u|x) = \Pr\{U = u | X = x\}$. In the EHS
representation this corresponds to the state

$$\rho^{ZAQ} = \sum_x p(x) \sum_u Q(u|x)\, |u\rangle\langle u|^Z \otimes |x\rangle\langle x|^A \otimes \rho_x^Q. \tag{5}$$

We are now ready to state our main result. 

Theorem 1 (CR-rate theorem for classical-quantum correlations)

$$C(R) = C^*(R) = R + D^*(R), \tag{6}$$

where

$$D^*(R) = \sup_{U|X} \{ I(U;Q) \mid I(U;X) - I(U;Q) \le R \}. \tag{7}$$

The supremum is to be understood as one over all conditional probability distributions p(u|x) for
the random variable U conditioned on X, with finite range $\mathcal{U}$. We may in fact restrict to the case
$|\mathcal{U}| \le |\mathcal{X}| + 1$, which in particular implies that the sup is actually a max.
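To make the optimization in (7) concrete, the sketch below evaluates, for one fixed conditional distribution Q(u|x), the pair (I(U;X) − I(U;Q), I(U;Q)), i.e. the communication rate that this choice of U requires and the distillable CR it yields. This is only an illustration under stated assumptions: NumPy, base-2 logarithms, and a hypothetical helper `feasible_point` that is not part of the paper. It does not perform the maximization itself. The example ensemble is the equiprobable pair |0⟩, (|0⟩+|1⟩)/√2.

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy in bits of a probability vector (zeros are skipped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

def von_neumann_entropy(rho):
    """Von Neumann entropy in bits, via the eigenvalues of rho."""
    return shannon_entropy(np.linalg.eigvalsh(rho))

def feasible_point(p_x, rhos, q_u_given_x):
    """Return (I(U;X) - I(U;Q), I(U;Q)) for the Markov chain U -> X -> Q.

    q_u_given_x[u, x] is the conditional probability Q(u|x).
    """
    p_x = np.asarray(p_x, dtype=float)
    q = np.asarray(q_u_given_x, dtype=float)
    joint = q * p_x                               # joint[u, x] = p(x) Q(u|x)
    p_u = joint.sum(axis=1)
    i_ux = shannon_entropy(p_u) + shannon_entropy(p_x) - shannon_entropy(joint.ravel())
    rho_avg = sum(px * rho for px, rho in zip(p_x, rhos))
    h_q_given_u = 0.0
    for u, pu in enumerate(p_u):
        if pu > 1e-12:
            # State of Q conditioned on U = u: sum_x p(x|u) rho_x.
            rho_u = sum(joint[u, x] / pu * rhos[x] for x in range(len(p_x)))
            h_q_given_u += pu * von_neumann_entropy(rho_u)
    i_uq = von_neumann_entropy(rho_avg) - h_q_given_u
    return i_ux - i_uq, i_uq

ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
p_x = [0.5, 0.5]
rhos = [np.outer(ket0, ket0), np.outer(ket_plus, ket_plus)]

r_full, d_full = feasible_point(p_x, rhos, np.eye(2))                  # U = X
r_none, d_none = feasible_point(p_x, rhos, np.array([[1.0, 1.0]]))     # U constant
r_noisy, d_noisy = feasible_point(p_x, rhos, np.array([[0.9, 0.1],
                                                       [0.1, 0.9]]))   # noisy copy of X
```

The choice U = X reproduces the pair (H(X|Q), I(X;Q)), a constant U gives (0, 0), and intermediate channels Q(u|x) give points in between; D*(R) is the upper envelope of all such points with first coordinate at most R.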

The proof of the theorem is divided into two parts: showing that C*(R) is an upper bound
for C(R) (commonly called the "converse" theorem), and then providing a direct coding scheme
demonstrating its achievability. We start with a couple of lemmas.

Lemma 2 D*(R), and hence C*(R), is monotonically increasing and concave; the latter meaning
that for $R_1, R_2 \ge 0$ and $0 \le \lambda \le 1$,

$$\lambda D^*(R_1) + (1-\lambda) D^*(R_2) \le D^*(\lambda R_1 + (1-\lambda) R_2).$$

Proof The monotonicity of D*(R) is obvious from its definition. To prove concavity, choose $U_1$,
$U_2$ feasible for $R_1$, $R_2$, respectively: in particular,

$$I(U_1;X) - I(U_1;Q) \le R_1,$$
$$I(U_2;X) - I(U_2;Q) \le R_2.$$

Then, introducing the new random variable

$$U = \begin{cases} (1, U_1) & \text{with probability } \lambda, \\ (2, U_2) & \text{with probability } 1-\lambda, \end{cases}$$

we have

$$\lambda I(U_1;X) + (1-\lambda) I(U_2;X) = I(U;X),$$
$$\lambda I(U_1;Q) + (1-\lambda) I(U_2;Q) = I(U;Q).$$

Thus

$$\lambda I(U_1;Q) + (1-\lambda) I(U_2;Q) = I(U;Q) \le D^*(R),$$

the last step from

$$I(U;X) - I(U;Q) \le \lambda R_1 + (1-\lambda) R_2 =: R.$$



Consider the n copy classical-quantum system $X^nQ^n = X_1Q_1X_2Q_2 \cdots X_nQ_n$, in the state
given by the nth tensor power of the ensemble $\{\rho_x, p(x)\}$. Define now

$$D^*_n(R) = \max_{U|X^n} \Big\{ \frac{1}{n} I(U;Q^n) \;\Big|\; \frac{1}{n}\big(I(U;X^n) - I(U;Q^n)\big) \le R \Big\}. \tag{8}$$

It turns out that this expression may be "single-letterized":

$$D^*_n(R) = D^*(R).$$

We prove slightly more by showing the following lemma, which implies the above equality by
iterative application and then using concavity of D* in R (lemma 2).

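Lemma 3 For two ensembles $\mathcal{E}_1 = \{\rho_x, p(x)\}$ ($x \in \mathcal{X}_1$) and $\mathcal{E}_2 = \{\sigma_{x'}, p'(x')\}$ ($x' \in \mathcal{X}_2$), denote
their respective D* functions $D^*(\mathcal{E}_1, R)$ and $D^*(\mathcal{E}_2, R)$. Then

$$D^*(\mathcal{E}_1 \otimes \mathcal{E}_2, R) = \max\{ D^*(\mathcal{E}_1, R_1) + D^*(\mathcal{E}_2, R_2) \mid R_1 + R_2 = R \}.$$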

Proof Let $\mathcal{E}_1$ and $\mathcal{E}_2$ correspond to the classical-quantum systems $X_1Q_1$ and $X_2Q_2$, respectively.
As before, we augment the joint system by the random variable U via the conditional distribution
$Q(u|xx')$, so that $UX_1X_2Q_1Q_2$ obeys the Markov property $U \to X_1X_2 \to Q_1Q_2$. In the EHS
representation we have

$$\rho^{ZA_1A_2Q_1Q_2} = \sum_{x,x',u} p(x)p'(x')Q(u|xx')\, |u\rangle\langle u|^Z \otimes |x\rangle\langle x|^{A_1} \otimes |x'\rangle\langle x'|^{A_2} \otimes \rho_x^{Q_1} \otimes \sigma_{x'}^{Q_2}.$$

By definition, $D^*(\mathcal{E}_1 \otimes \mathcal{E}_2, R)$ equals $I(U;Q_1Q_2)$ maximized over all variables U such that
$I(U;X_1X_2) - I(U;Q_1Q_2) \le R$.

Now the inequality "≥" in the lemma is clear: we could choose $U_1$ optimal for $\mathcal{E}_1$ and $R_1$,
and $U_2$ optimal for $\mathcal{E}_2$ and $R_2$, and form $U = U_1U_2$. By elementary operations with the definition
of D* we see that $D^*(\mathcal{E}_1, R_1) + D^*(\mathcal{E}_2, R_2)$ is achieved.

For the reverse inequality, let U be any variable such that $I(U;X_1X_2) - I(U;Q_1Q_2) \le R$. First
note that the Markov property $U \to X_1X_2 \to Q_1Q_2$ implies $I(U;X_1X_2) = I(U;X_1Q_1X_2Q_2)$,
which can easily be verified in the EHS representation. Intuitively, possessing $Q_1Q_2$ in addition to
knowing $X_1X_2$ conveys no extra information about U. Hence, by the chain rule,

$$I(U;X_1X_2) - I(U;Q_1Q_2) = I(U;X_1X_2|Q_1Q_2).$$

Now, using the chain rule and once more the fact that the content of $Q_1$ is a function of $X_1$, we
estimate

$$\begin{aligned}
R &\ge I(U;X_1X_2|Q_1Q_2) \\
  &= I(U;X_1|Q_1Q_2) + I(U;X_2|Q_1Q_2X_1) \\
  &= I(U;X_1|Q_1Q_2) + I(U;X_2|Q_2X_1) \\
  &\ge I(U;X_1|Q_1) + I(U;X_2|Q_2X_1).
\end{aligned}$$

Here the inequality of the last line is obtained by the following reasoning:

$$\begin{aligned}
I(U;X_1|Q_1Q_2) &= I(UQ_2;X_1|Q_1) - I(X_1;Q_2|Q_1) \\
                &\ge I(U;X_1|Q_1) - 0,
\end{aligned}$$

using strong subadditivity and the fact that $X_1Q_1$ and $X_2Q_2$ are in a product state.
Hence there are $R_1$ and $R_2$ summing to R for which

$$I(U;X_1) - I(U;Q_1) = I(U;X_1|Q_1) \le R_1, \tag{9}$$
$$I(U;X_2|X_1) - I(U;Q_2|X_1) = I(U;X_2|Q_2X_1) \le R_2. \tag{10}$$

On the other hand,

$$\begin{aligned}
I(U;Q_1Q_2) &= I(U;Q_1) + I(U;Q_2|Q_1) \\
            &= I(U;Q_1) + I(UQ_1;Q_2) - I(Q_1;Q_2) \\
            &\le I(U;Q_1) + I(UX_1;Q_2) \\
            &= I(U;Q_1) + I(X_1;Q_2) + I(U;Q_2|X_1) \\
            &= I(U;Q_1) + I(U;Q_2|X_1),
\end{aligned} \tag{11}$$

using the chain rule repeatedly; the inequality comes from the quantum analogue of the
familiar data processing inequality [3], another consequence of the content of $Q_1$ being a
function of $X_1$. With (9) and by definition of D*, $I(U;Q_1) \le D^*(\mathcal{E}_1, R_1)$. But also, with (10),
$I(U;Q_2|X_1) \le D^*(\mathcal{E}_2, R_2)$, observing that the conditional mutual informations in (11) as well as in
(10) are probability averages over unconditional mutual informations, and invoking the concavity
of D* (lemma 2).
Hence,

$$I(U;Q_1Q_2) \le D^*(\mathcal{E}_1, R_1) + D^*(\mathcal{E}_2, R_2),$$

and since U was arbitrary, we are done. ■

Proof of Theorem 1 (converse) For a given blocklength n, measurement on Bob's side will turn
the classical-quantum correlations into classical ones, and $Q^n$ gets replaced by the measurement
outcome random variable $Y^{(n)}$. Now we can apply the classical converse [2] to the classical random
variable pair $(X^n, Y^{(n)})$:

$$C(R) \le R + \max_{U|X^n} \Big\{ \frac{1}{n} I(U;Y^{(n)}) \;\Big|\; I(U;X^n) - I(U;Y^{(n)}) \le nR \Big\}.$$

By the Holevo inequality [53]

$$I(U;Y^{(n)}) \le I(U;Q^n),$$

this can be further bounded by $R + D^*_n(R)$, which is, by lemma 3, equal to C*(R). To complete the
proof, we need to show that the supremum in (7) can be restricted to a set $\mathcal{U}$ of cardinality
$|\mathcal{U}| \le |\mathcal{X}| + 1$. This is a standard consequence of Carathéodory's theorem, and the proof runs in
exactly the same way as that in, e.g., [?]. ■

We shall need some auxiliary results before we embark on proving the achievability of C* (R) . 






Lemma 4 The (C, R) pair (H(X), H(X|Q)) is achievable when Alice and Bob share the
classical-quantum system XQ.

Proof This follows from the classical-quantum Slepian-Wolf result [17], which states that, for any
ε, δ > 0 and sufficiently large n, the classical communication rate from Alice to Bob sufficient for
Bob to reproduce $X^n$ with error probability ≤ ε is H(X|Q) + δ. ■

Remark Lemma 4 already yields the value of

$$D(\infty) = D(H(X|Q)) = H(X) - H(X|Q) = I(X;Q)$$

for the classical-quantum system XQ. This justifies our interpretation of $D(\infty)$ as the amount of
classical correlation in XQ.

Lemma 5 Let σ be a state in a D-dimensional Hilbert space. Then $\mathrm{Tr}(\sigma B) = 1 - \epsilon$ for some
operator $0 \le B \le 1$ implies

$$H(\sigma) \le 1 + \epsilon \log D + (1-\epsilon) \log(\mathrm{Tr}\, B + 1). \tag{12}$$

Proof Diagonalize σ as $\sigma = \sum_{j=1}^{D} p_j |j\rangle\langle j|$ with $p_1 \le p_2 \le \cdots \le p_D$ and define $b_j = \langle j|B|j\rangle$, so that

$$\sum_j p_j b_j = 1 - \epsilon \tag{13}$$

and $\mathrm{Tr}\, B = \sum_j b_j$. Further define the random variable J with $\Pr\{J = j\} = p_j$, for which
$H(\sigma) = H(J)$. Consider the vector $(b_1, \ldots, b_D)$ which minimizes $\sum_j b_j$ subject to constraints (13) and
$0 \le b_j \le 1$. This is a trivial linear programming problem, solved at the boundary of the allowed
region for the $b_j$. It is easily verified that the solution is given by

$$b_1 = \cdots = b_{k-1} = 0, \qquad 0 \le b_k \le 1, \qquad b_{k+1} = \cdots = b_D = 1,$$

for some $1 \le k \le D$ for which (13) is satisfied. Note that

$$D - k \le \sum_j b_j \le \mathrm{Tr}\, B$$

and $\sum_{j=1}^{k-1} p_j \le \epsilon$. Define the indicator random variable

$$I(J) = \begin{cases} 1 & J \ge k, \\ 0 & \text{otherwise.} \end{cases}$$

We then have

$$\begin{aligned}
H(J) &= H(I) + H(J|I) \\
     &\le 1 + \Pr\{I = 0\} \log D + \Pr\{I = 1\} \log(D + 1 - k) \\
     &\le 1 + \epsilon \log D + (1-\epsilon) \log(\mathrm{Tr}\, B + 1),
\end{aligned}$$

which proves the lemma. ■
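The bound (12) can be checked numerically on a small example. The sketch below is an illustration only, assuming NumPy and base-2 logarithms; the helper `lemma5_sides` and the particular choice of B (the projector onto the three largest-eigenvalue eigenvectors of a random 8-dimensional state) are hypothetical and not taken from the paper.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Von Neumann entropy in bits, via the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def lemma5_sides(sigma, B):
    """Return (H(sigma), 1 + eps log D + (1 - eps) log(Tr B + 1)) with eps = 1 - Tr(sigma B)."""
    dim = sigma.shape[0]
    eps = 1.0 - float(np.real(np.trace(sigma @ B)))
    tr_b = float(np.real(np.trace(B)))
    rhs = 1.0 + eps * np.log2(dim) + (1.0 - eps) * np.log2(tr_b + 1.0)
    return von_neumann_entropy(sigma), rhs

rng = np.random.default_rng(0)
dim = 8
# Random density matrix: G G^dagger normalized to unit trace.
G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
sigma = G @ G.conj().T
sigma = sigma / np.real(np.trace(sigma))

# B = projector onto the three eigenvectors of sigma with largest eigenvalues
# (np.linalg.eigh returns eigenvalues in ascending order).
_, evecs = np.linalg.eigh(sigma)
top = evecs[:, -3:]
B = top @ top.conj().T

lhs, rhs = lemma5_sides(sigma, B)
lhs_id, rhs_id = lemma5_sides(sigma, np.eye(dim))   # B = identity, eps = 0
```

In the proof of lemma 6 the operator B is taken to be a conditionally typical projector, whose trace is exponentially smaller than the full dimension, which is what makes the bound useful.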

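In order to understand the next two results, some background on typical sets $T^n_{U,\delta}$, conditionally
typical sets $T^n_{X|U,\delta}(u^n)$, typical subspaces $\Pi^n_{Q,\delta}$ and conditionally typical subspaces $\Pi^n_{Q|U,\delta}(u^n)$ is
needed [?]. This is provided in the Appendix.

Lemma 6 For every ε, δ > 0 and set $\mathcal{E} \subset \mathcal{X}^n$ with $\Pr\{X^n \in \mathcal{E}\} \ge \epsilon$, there exists a subset $\mathcal{F} \subset \mathcal{E}$
and a sequence $u^n \in T^n_{U,\delta}$ such that

$$\Big| \frac{1}{n} \log|\mathcal{F}| - H(X|U) \Big| \le \delta \tag{14}$$

whenever $n \ge n_1(|\mathcal{U}|, |\mathcal{X}|, \epsilon, \delta)$. In addition, whenever $n \ge n_2(|\mathcal{U}|, |\mathcal{X}|, d, \epsilon, \delta)$,

$$\frac{1}{n} H(Q^n | X^n \in \mathcal{F}) \le H(Q|U) + \delta. \tag{15}$$

Proof Clearly, it suffices to prove the claim for some sufficiently small ε. The first claim (14) is a
purely classical result and corresponds to lemma 3.3.3 of Csiszár and Körner [13]. Thus it remains
to demonstrate (15). We shall need the following facts from the Appendix. For sufficiently large
$n \ge n_0(|\mathcal{U}|, |\mathcal{X}|, d, \delta', \epsilon)$, for $x^n \in T^n_{X|U,\delta'}(u^n)$ and $u^n \in T^n_{U,\delta'}$,

$$\mathrm{Tr}\big(\rho_{u^n x^n}\, \Pi^n_{Q|U,(|\mathcal{X}|+1)\delta'}(u^n)\big) \ge 1 - \epsilon, \tag{16}$$

and

$$\mathrm{Tr}\, \Pi^n_{Q|U,(|\mathcal{X}|+1)\delta'}(u^n) \le 2^{n[H(Q|U) + (2+|\mathcal{X}|)c\delta']}. \tag{17}$$

Since $\rho_{u^n x^n} = \rho_{x^n}$, it follows from the linearity of trace and (16) that

$$\mathrm{Tr}\big(\rho_{\mathcal{F}}\, \Pi^n_{Q|U,(|\mathcal{X}|+1)\delta'}(u^n)\big) \ge 1 - \epsilon,$$

where

$$\rho_{\mathcal{F}} = \sum_{x^n} \Pr\{X^n = x^n | X^n \in \mathcal{F}\}\, \rho_{x^n}.$$

Finally, combining with (17) and lemma 5,

$$\frac{1}{n} H(Q^n | X^n \in \mathcal{F}) = \frac{1}{n} H(\rho_{\mathcal{F}}) \le H(Q|U) + \frac{1}{n} + \epsilon \log d + (2+|\mathcal{X}|)c\delta'.$$

For sufficiently small $\epsilon \le \delta'$, and setting $n_2 = \max\{n_0, n_1, \delta'^{-1}\}$, (15) follows with

$$\delta' = \frac{\delta}{(2+|\mathcal{X}|)c + 1 + \log d}.$$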



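Corollary 7 For every ε, δ > 0 and $n \ge n_2(|\mathcal{U}|, |\mathcal{X}|, d, \delta, \epsilon)$ there exists a function $g : \mathcal{X}^n \to \mathcal{U}^n$
such that

$$\frac{1}{n} H(Q^n | g(X^n)) \le H(Q|U) + \delta, \tag{18}$$

$$\Big| \frac{1}{n} H(X^n | g(X^n)) - H(X|U) \Big| \le \delta. \tag{19}$$

Proof Again it suffices to prove the claim for sufficiently small ε. By an iterative application of
lemma 6 we can find disjoint subsets $\mathcal{F}_1, \ldots, \mathcal{F}_M$ of $\mathcal{X}^n$ such that

$$\Pr\Big\{X^n \notin \bigcup_{a=1}^{M} \mathcal{F}_a\Big\} \le \epsilon$$

and for some sequences $u^n_a \in T^n_{U,\delta}$, $a = 1, \ldots, M$,

$$\Big| \frac{1}{n} \log|\mathcal{F}_a| - H(X|U) \Big| \le \frac{\delta}{2}$$

and

$$\frac{1}{n} H(Q^n | X^n \in \mathcal{F}_a) \le H(Q|U) + \frac{\delta}{2}.$$

Define, choosing some $u^n_0$ different from the $u^n_a$,

$$g(x^n) = \begin{cases} u^n_a & x^n \in \mathcal{F}_a, \\ u^n_0 & \text{otherwise.} \end{cases}$$

Then

$$\frac{1}{n} H(Q^n | g(X^n)) \le H(Q|U) + \frac{\delta}{2} + \epsilon H(Q)$$

and

$$\Big| \frac{1}{n} H(X^n | g(X^n)) - H(X|U) \Big| \le \frac{\delta}{2} + \epsilon H(X).$$

Finally, choose $\epsilon \le \min\big\{\frac{\delta}{2H(Q)}, \frac{\delta}{2H(X)}\big\}$. ■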

We are now in a position to prove the direct coding part of theorem 1.

Proof of Theorem 1 (coding) We first show that $(C, R) = (I(U;X), I(U;X) - I(U;Q))$ is
achievable. We follow the classical proof [2] closely. Define $K(X^n) = g(X^n)$. Then

$$\frac{1}{n} H(X^n | K) = H(X) - \frac{1}{n} H(K)$$

and (19) imply

$$\Big| \frac{1}{n} H(K) - I(U;X) \Big| \le \delta. \tag{20}$$

Also by (18) and (20) we have

$$\frac{1}{n} \big(H(K) - I(K;Q^n)\big) \le I(U;X) - I(U;Q) + 2\delta.$$

Note that lemma 4 applied to the supersystem $KQ^n$ guarantees the achievability of $(H(K), H(K) - I(K;Q^n))$.
Hence, for sufficiently large (super)blocklength k there exists a mapping $f(K^k)$ of
rate $\frac{1}{nk} \log|f| \le I(U;X) - I(U;Q) + 2\delta$ (here |f| is the image size of f), which allows $K^k$ to
be reproduced with ε error. This yields an amount of ε-randomness bounded from below by
$nk(I(U;X) - \delta)$. However, to prove the claim, we need to show that the rate is bounded from above
by exactly $I(U;X) - I(U;Q)$. This is accomplished by setting the blocklength to $N = nk(1 + 2\delta\kappa)$,
where

$$\kappa = \frac{1}{I(U;X) - I(U;Q)},$$

and ignoring the last $2\delta\kappa nk$ source outputs. Then indeed

$$R = \frac{1}{N} \log|f| \le I(U;X) - I(U;Q),$$

while

$$C = \frac{1}{N} H(K^k) \ge I(U;X) - \delta(\kappa' + 2\kappa),$$

with a constant $\kappa'$ that does not depend on δ.

If now the classical communication rate R' is available, we may use the procedure outlined 
above to achieve a CR rate of I(U;X) while communicating at rate R = I(U;X) — I(U; Q), at 
least if R < R' . But of course the "surplus" R'— R is then still free to generate common randomness 
trivially by Alice transmitting locally generated fair coin flips. This shows that at communication 
rate R' , CR at rate 

$$C = R' - R + I(U;X) = R' + I(U;Q)$$

can be generated. ■

Remark For $R \le H(X) - I(X;Q) = H(X|Q)$, the inequality constraint in the maximization (7) may be
replaced by an equality, i.e.,

$$D^*(R) = D(R), \tag{21}$$

where

$$D(R) = \max\{ I(U;Q) \mid I(U;X) - I(U;Q) = R \}.$$

To see this, note that

$$D^*(R) = \max_{0 \le R' \le R} D(R'),$$

so it suffices to show that D(R) is monotonically increasing. This, in turn, holds if D(R) is concave
and achieves its maximum for R = H(X|Q). The concavity proof is virtually identical to the proof
of lemma 2. The second property follows from

$$I(U;Q) \le I(UX;Q) = I(X;Q)$$

and $I(U;X|Q) \le H(X|Q)$.

Note that for $R \ge H(X|Q)$, the function D*(R) is simply constant (and equal to $D(\infty) = I(X;Q)$).

Having established (21), we shall now relate D*(R) to the quantum compression with classical
side information trade-off curve Q*(R) of Hayden, Jozsa and Winter [19]. For a classical-quantum
system XQ, given by the pure state ensemble $\{|\varphi_x\rangle, p(x)\}$, and $R \le H(X)$,

$$Q^*(R) = \min\{ H(Q|U) \mid I(U;X) = R \} = H(Q) - \max\{ I(U;Q) \mid I(U;X) = R \}.$$

(For rates R > H(X), Q*(R) = 0.)

The following relation to our C*(R) is now easily verified:

$$D^*(x) + Q^*(D^*(x) + x) = H(Q). \tag{22}$$

Indeed, for $x \le H(X|Q)$ and a maximizing variable U, $x = I(U;X) - I(U;Q)$ and $D^*(x) = I(U;Q)$.
Then $x + D^*(x) = I(U;X)$, so U is feasible for $Q^*(x + D^*(x))$ and indeed optimal, using
once more the monotonicity of D.

We should remark, however, that to the best of our knowledge, eq. (22) has no simple
operational meaning. Still, it allows us to "import" the numerically calculated trade-off curves from [19]
for various ensembles of interest: the curves are then parametrized via $s = x + D^*(x)$ and x.

Figure 1 (cf. [19], figure 2) shows the distillable CR-rate trade-off curve D(R) = D*(R) for the
simple two-state ensemble $\mathcal{E}$ given by the non-orthogonal pair $\{|0\rangle, \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)\}$, each occurring
with probability $\frac{1}{2}$. This curve is not much better than the linear lower bound obtained by
time-sharing between (0, 0) and the Slepian-Wolf point $(1 - H(\mathcal{E}), H(\mathcal{E}))$, where $H(\mathcal{E})$ denotes the
entropy of the average density matrix of the ensemble $\mathcal{E}$.
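The Slepian-Wolf point and the time-sharing lower bound for this two-state ensemble can be computed directly. The sketch below is an illustration under stated assumptions (NumPy, base-2 logarithms); the names `sw_point` and `time_sharing_bound` are hypothetical. It evaluates only the linear lower bound, not the actual curve D*(R), which requires the optimization in (7).

```python
import numpy as np

def von_neumann_entropy(rho):
    """Von Neumann entropy in bits, via the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)

# Average density matrix of the equiprobable ensemble {|0>, (|0>+|1>)/sqrt(2)}.
rho_avg = 0.5 * np.outer(ket0, ket0) + 0.5 * np.outer(ket_plus, ket_plus)
h_ensemble = von_neumann_entropy(rho_avg)      # H(E), roughly 0.60 bits

# Slepian-Wolf point (R, D) = (H(X|Q), I(X;Q)) = (1 - H(E), H(E)) for pure states.
sw_point = (1.0 - h_ensemble, h_ensemble)

def time_sharing_bound(rate):
    """Linear lower bound on D*(R): time-sharing between (0, 0) and the Slepian-Wolf point."""
    r_sw, d_sw = sw_point
    return d_sw * min(rate / r_sw, 1.0)

half_way = time_sharing_bound(0.5 * sw_point[0])
```

Beyond the Slepian-Wolf rate the bound is flat, matching the fact that D*(R) is constant and equal to I(X;Q) for R ≥ H(X|Q).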

Figure 2 (cf. [19], figure 4) corresponds to the three state ensemble $\mathcal{E}_3$ consisting of the states
$|\varphi_1\rangle = |0\rangle$, $|\varphi_2\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $|\varphi_3\rangle = |2\rangle$ with equal probabilities. Without any communication
it is already possible to extract $h_2(\frac{1}{3})$ bits of CR, due to Bob's ability to perfectly distinguish
whether his state is in $\{|\varphi_1\rangle, |\varphi_2\rangle\}$ or $\{|\varphi_3\rangle\}$. The curve then follows a rescaled version of figure 1
to meet the Slepian-Wolf point $(H(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}) - H(\mathcal{E}_3), H(\mathcal{E}_3))$.

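Our third example is the parametrized BB84 ensemble $\mathcal{E}_{BB}(\theta)$, defined by the states

$$\begin{aligned}
|\varphi_1\rangle &= |0\rangle \\
|\varphi_2\rangle &= \cos\theta\,|0\rangle + \sin\theta\,|1\rangle \\
|\varphi_3\rangle &= |1\rangle \\
|\varphi_4\rangle &= -\sin\theta\,|0\rangle + \cos\theta\,|1\rangle,
\end{aligned}$$

each chosen with probability $\frac{1}{4}$. The $D(R)$ curve for $\theta = \pi/8$, shown in figure 3 (cf. [19], figure 5),
has a special point at which the slope is discontinuous. For $0 < \theta < \pi/4$, $\mathcal{E}_{BB}(\theta)$ has a natural coarse
graining to the ensemble consisting of two equiprobable mixed states, $\frac{1}{2}(|\varphi_1\rangle\langle\varphi_1| + |\varphi_2\rangle\langle\varphi_2|)$ and
$\frac{1}{2}(|\varphi_3\rangle\langle\varphi_3| + |\varphi_4\rangle\langle\varphi_4|)$. The special point is precisely the Slepian-Wolf point for this coarse-grained
ensemble, treating $|\varphi_1\rangle$ and $|\varphi_2\rangle$, and $|\varphi_3\rangle$ and $|\varphi_4\rangle$, as indistinguishable.

Figure 3: $D(R)$ for the parametrized BB84 ensemble with $\theta = \pi/8$.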
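The location of the coarse-grained Slepian-Wolf point can be estimated with the sketch below. It assumes base-2 logarithms, writes $Y$ for the (hypothetical) coarse-grained label, and uses two facts: each mixed state is an equal mixture of two pure states with overlap $\cos\theta$, and the average of the four BB84 states is maximally mixed.

```python
import math


def h2(p):
    """Binary Shannon entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bb84_coarse_grained_point(theta):
    """(R, D) = (H(Y|Q), I(Y;Q)) for the two-mixed-state coarse graining."""
    # Each coarse-grained state mixes two pure states with overlap cos(theta),
    # so its eigenvalues are (1 + cos(theta))/2 and (1 - cos(theta))/2.
    entropy_of_each_mixed_state = h2((1 + math.cos(theta)) / 2)
    # The average over all four states is maximally mixed, so H(Q) = 1 bit.
    holevo = 1.0 - entropy_of_each_mixed_state   # I(Y;Q)
    cond_entropy = 1.0 - holevo                  # H(Y|Q), since H(Y) = 1 bit
    return cond_entropy, holevo


special_point = bb84_coarse_grained_point(math.pi / 8)
print(special_point)
```

For $\theta = \pi/8$ this places the point near $R \approx 0.23$, $D \approx 0.77$.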

Finally, figure 4 (cf. [19], figure 5 and ^Hj) shows $D(R)$ for the uniform qubit ensemble, a
uniform distribution of pure states over the Bloch sphere. Strictly speaking, theorem 1 should be
extended to include continuous ensembles; we shall not do this here, but merely conjecture it and
refer the reader to [19] for an example of such an extension. The curve approaches $D = 1$ only in
the $R \to \infty$ limit. It has an explicit parametrization computed from (22) and [15]:

r = ^^) + ^- 2+i ° g G^i) 

D(R) = i-fcQ --,!_) 
for $\lambda \in (0, \infty)$, where $h_2(p) = -p\log p - (1-p)\log(1-p)$ is the binary Shannon entropy.

3 General quantum correlations 

Consider the following double-blocking protocol for the case of {qq} resources: given a word of
length $nL$, Alice performs the same measurement on each of the $n$ blocks of length $L$. This leaves
her with $n$ copies of the resulting {cq} resource, to which we apply the {cq} protocol described in
the previous section. Letting $n \to \infty$ and then $L \to \infty$ yields the same results as the most general
protocol described in Section 1. Let us assume $L = 1$ for the moment. The measurement $\mathcal{M}$ on
Alice's subsystem $A$, defined by the positive operators $(E_x)_{x \in \mathcal{X}}$ with $\sum_x E_x = 1$, may be thought
of as a map sending a quantum system $AB$ in the state $\rho^{AB}$ to a classical-quantum system $XQ$ in
the state given by the ensemble $\{\rho_x, p(x)\}$, where

$$p(x) = \mathrm{Tr}(\rho^A E_x), \qquad \rho_x = \frac{1}{p(x)}\,\mathrm{Tr}_A\left[\rho^{AB}(E_x \otimes 1)\right].$$

Figure 4: $D(R)$ for the uniform ensemble.
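The map from $\rho^{AB}$ to the ensemble induced by Alice's measurement can be sketched for two qubits as follows. The function name and the nested-list matrix format are illustrative assumptions, and $\rho_x$ is taken to be the normalised post-measurement state on Bob's side.

```python
def measure_alice(rho_ab, povm):
    """Return [(p(x), rho_x)] for a POVM (E_x) on qubit A of a two-qubit state.

    rho_ab: 4x4 matrix in the basis |a b>, with row/column index 2*a + b.
    povm:   list of 2x2 matrices E_x summing to the identity.
    """
    ensemble = []
    for e in povm:
        # Unnormalised Bob state Tr_A[rho_ab (E_x (x) 1)], entry (b, b2).
        unnorm = [[sum(rho_ab[2 * a + b][2 * a2 + b2] * e[a2][a]
                       for a in range(2) for a2 in range(2))
                   for b2 in range(2)] for b in range(2)]
        p = (unnorm[0][0] + unnorm[1][1]).real       # p(x) = Tr(rho^A E_x)
        rho_x = [[v / p for v in row] for row in unnorm] if p > 0 else None
        ensemble.append((p, rho_x))
    return ensemble


# Example: Bell state (|00> + |11>)/sqrt(2), measured in the computational basis.
bell = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]
ensemble = measure_alice(bell, [[[1, 0], [0, 0]], [[0, 0], [0, 1]]])
print(ensemble)
```

In this example each outcome occurs with probability $\frac{1}{2}$ and leaves Bob with the matching basis state, so the resulting ensemble carries one bit of perfectly correlated information.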



All the relevant information is now encoded in the shared ensemble. Theorem 1 now applies,
yielding an expression for the $L = 1$ CR-rate curve:

$$C^{(1)}(R) = R + \max_{\mathcal{M}: AB \to XQ}\, \max_{U|X} \{ I(U;Q) \mid I(U;X) - I(U;Q) \leq R \}. \tag{23}$$

Similarly we have

$$D^{(1)}(\infty) = \max_{\mathcal{M}: AB \to XQ} I(X;Q), \tag{24}$$

which is precisely the classical correlation measure $C_A(\rho^{AB})$ proposed in [21]. Note that w.l.o.g. we
may assume the measurement to be rank-one, and $|\mathcal{X}| \leq d^2$, $d$ the dimension of the $A$-system,
because a non-extremal POVM cannot be optimal.

However, in general one must allow for "entangling" measurements performed on an arbitrary
number $L$ of copies of $\rho^{AB}$, yielding an expression for $C^{(L)}(R)$ analogous to (23):

$$C^{(L)}(R) = R + \max_{\mathcal{M}: A^L B^L \to XQ^L} \frac{1}{L} \max_{U|X} \{ I(U;Q^L) \mid I(U;X) - I(U;Q^L) \leq LR \}.$$

Finally, taking the large $L$ limit gives

$$C(R) = \lim_{L \to \infty} C^{(L)}(R).$$

Similarly

$$D(\infty) = \lim_{L \to \infty} D^{(L)}(\infty),$$

which is the "regularized" version of $D^{(1)}(\infty)$ and the more appropriate asymmetric measure of
classical correlations present in the bipartite state $\rho^{AB}$. It is an interesting question whether $L = 1$
suffices to attain $C(R)$, or at least $D(\infty)$. In the remainder of this section we present some partial
results concerning this issue.

Example 8 Let Alice and Bob switch roles: consider a state

$$\rho^{AB} = \sum_x p(x)\, \rho_x^A \otimes |x\rangle\langle x|^B,$$

i.e. now Alice holds the ensemble states $\rho_x$ while Bob has the classical information $x$, with
probability $p(x)$.

According to (24), $D^{(1)}(\infty)$ is equal to the accessible information of the state ensemble $\mathcal{E} =
\{\rho_x, p(x)\}$, denoted $I_{acc}(\mathcal{E})$ [22]. On the other hand, we know from [22] that $I_{acc}(\mathcal{E} \otimes \mathcal{E}') =
I_{acc}(\mathcal{E}) + I_{acc}(\mathcal{E}')$ for a second ensemble $\mathcal{E}'$; hence

$$D(\infty) = D^{(1)}(\infty) = D^{(L)}(\infty) = I_{acc}(\mathcal{E}).$$

This single-letterization of the accessible correlation can, in fact, be generalized to arbitrary
separable states. Indeed, the following holds, in some analogy to the additivity of capacity for
entanglement breaking channels (we include the state dependence in our notation of $D^{(1)}$,
etc.):

Theorem 9 Let $\rho^{AB}$ be separable and $\sigma^{A'B'}$ be arbitrary. Then,

$$D^{(1)}(\rho \otimes \sigma, \infty) = D^{(1)}(\rho, \infty) + D^{(1)}(\sigma, \infty).$$

From this, by iteration, we get of course

$$D(\rho, \infty) = D^{(L)}(\rho, \infty) = D^{(1)}(\rho, \infty).$$

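Proof $D^{(1)}(\rho \otimes \sigma, \infty) \geq D^{(1)}(\rho, \infty) + D^{(1)}(\sigma, \infty)$ is trivial for arbitrary states, for we can always
use product measurements. For the opposite inequality, we write $\rho$ as a mixture of product states:

$$\rho^{AB} = \sum_j q_j\, \tau_j^A \otimes \tau_j^B,$$

which can be regarded as part of a classical-quantum system $JAB$ with EHS representation

$$\sum_j q_j\, |j\rangle\langle j|^J \otimes \tau_j^A \otimes \tau_j^B,$$

whose partial trace over $J$ it obviously is.

Now we consider a measurement $\mathcal{M} = (E_x)_{x \in \mathcal{X}}$ on the combined system $AA'$. Then, by
definition, the post measurement states on $BB'$ and the probabilities are given by

$$\begin{aligned}
p(x)\rho_x &= \mathrm{Tr}_{AA'}\left[(\rho^{AB} \otimes \sigma^{A'B'})(E_x^{AA'} \otimes 1^{BB'})\right] \\
&= \sum_j q_j\, \tau_j^B \otimes \mathrm{Tr}_{AA'}\left[(\tau_j^A \otimes \sigma^{A'B'})(E_x^{AA'} \otimes 1^{B'})\right] \\
&= \sum_j q_j\, \tau_j^B \otimes \mathrm{Tr}_{A'}\left[\sigma^{A'B'}(F_{x|j} \otimes 1^{B'})\right],
\end{aligned}$$

with the POVMs $\mathcal{N}_j = (F_{x|j})_{x \in \mathcal{X}}$ on $A'$, labeled by the different $j$:

$$F_{x|j} = \mathrm{Tr}_A\left(E_x(\tau_j^A \otimes 1)\right).$$

Thus, applying the measurement $\mathcal{M}$ on $AA'$ to $\rho^{AB} \otimes \sigma^{A'B'}$, and storing the result in $X$ leads
to the classical-quantum system $XJBB'$ defined by the EHS state

$$\sum_{x,j} q_j\, |x\rangle\langle x|^X \otimes |j\rangle\langle j|^J \otimes \tau_j^B \otimes \mathrm{Tr}_{A'}\left[\sigma^{A'B'}(F_{x|j} \otimes 1)\right].$$

With respect to it,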



$$\begin{aligned}
I(X;BB') &= I(X;B) + I(X;B'|B) \\
&= I(X;B) + I(XB;B') - I(B;B') \\
&= I(X;B) + I(XB;B') \\
&\leq I(X;B) + I(XJ;B') \\
&= I(X;B) + I(XJ;B') - I(J;B') \\
&= I(X;B) + I(X;B'|J),
\end{aligned} \tag{25}$$

using the chain rule, the fact that $BB'$ is in a product state, the data processing inequality [3], the
fact that $JB'$ is in a product state and the chain rule once more.

In (25) notice that the first mutual information, $I(X;B)$, relates to applying the POVM $\mathcal{M}$ to
$A$, with an ancilla $A'$ in the state $\sigma^{A'}$, but this can be described by a POVM $\mathcal{N}$ on $A$ alone. The
second, $I(X;B'|J)$, is a probability average over mutual informations relating to different POVMs
on $A'$. Thus

$$I(X;BB') \leq D^{(1)}(\rho, \infty) + D^{(1)}(\sigma, \infty),$$

which yields the claim, as $\mathcal{M}$ was arbitrary. □



Example 10 For a pure entangled state $\psi = |\psi\rangle\langle\psi|$, we can easily see that

$$D^{(1)}(\psi, \infty) = D^{(1)}(\psi, 0) = E(|\psi\rangle) = H(\mathrm{Tr}_B\, \psi).$$

Indeed, the right hand side is attained for Alice and Bob both measuring in bases corresponding
to a Schmidt decomposition of $|\psi\rangle$. On the other hand, in the definition of $D^{(1)}$, eq. (24), the
mutual information $I(X;Q)$ is upper bounded by $H(Q)$, which is the right hand side in the above
equation.
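The attainability argument can be checked numerically. The sketch below assumes Alice measures in the Schmidt basis, so that outcome $j$ occurs with probability $\lambda_j$ (the squared Schmidt coefficient) and leaves Bob in the orthogonal pure state $|j\rangle$; function names are illustrative and entropies are in bits.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def schmidt_measurement_information(schmidt_probs):
    """I(X;Q) when Alice measures a pure state in its Schmidt basis.

    Bob's conditional states are orthogonal pure states, so H(Q|X) = 0 and
    H(Q) equals the entropy of the squared Schmidt coefficients.
    """
    h_q = shannon_entropy(schmidt_probs)
    h_q_given_x = 0.0
    return h_q - h_q_given_x


# A maximally entangled two-qubit state has squared Schmidt coefficients (1/2, 1/2).
info_bell = schmidt_measurement_information([0.5, 0.5])
print(info_bell)
```

The returned value coincides with the entropy of entanglement, which is the upper bound $H(Q)$ on $I(X;Q)$.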

Thus, if both $\psi$ and $\varphi$ are pure entangled states,

$$D^{(1)}(\psi \otimes \varphi, \infty) = D^{(1)}(\psi, \infty) + D^{(1)}(\varphi, \infty).$$

In particular,

$$D(\psi, \infty) = D^{(L)}(\psi, \infty) = D^{(1)}(\psi, \infty).$$

More generally, we have (compare to the additivity of channel capacity if one of the channels
is noiseless [31]):

Theorem 11 Let $\rho^{AB} = |\psi\rangle\langle\psi|$ be pure and $\sigma^{A'B'}$ arbitrary. Then

$$D^{(1)}(\rho \otimes \sigma, \infty) = D^{(1)}(\rho, \infty) + D^{(1)}(\sigma, \infty).$$

Proof As usual, only "$\leq$" has to be proved. Given any POVM $\mathcal{M} = (E_x)_{x \in \mathcal{X}}$ on $AA'$, the
classical-quantum correlations $XBB'$ remaining after this measurement is performed are described
by

$$\omega = \sum_x |x\rangle\langle x|^X \otimes \mathrm{Tr}_{AA'}\left[(\rho^{AB} \otimes \sigma^{A'B'})(E_x^{AA'} \otimes 1)\right].$$

We shall assume that $|\psi\rangle$ is in Schmidt form:

$$|\psi\rangle = \sum_j \sqrt{\lambda_j}\, |j\rangle^A |j\rangle^B.$$

Measuring in the basis $|j\rangle$ on $B$ and recording the result in orthogonal states $|j\rangle$ in a register $J$
transforms $\omega$ into a state which we denote $\omega'$.

We claim that

$$I_\omega(X;BB') \leq I_{\omega'}(X;B'|J) + H_\omega(B), \tag{26}$$

where the subscript indicates the state relative to which the respective information quantity is
understood. Clearly, from this the theorem follows: on the right hand side, the entropy is the
entropy of entanglement of $\rho$, and the mutual information is an average of mutual informations for
measurements $\mathcal{M}_j$ on $A'$, defined as performing $\mathcal{M}$ with ancillary state $|j\rangle\langle j|$ on $A$.

To prove (26), we first reformulate it such that all entropies refer to the same state. For this,
observe that the measurement of $j$ can be done by adjoining the register $J$ in a null state $|0\rangle$,
applying a unitary which maps $|j\rangle^B|0\rangle^J$ to $|j\rangle^B|j\rangle^J$, and tracing out $B$. Denote by $\Omega$ the state
obtained from $\omega$ by this procedure. Obviously then, (26) is equivalent to

$$I(X;BJB') \leq I(X;B'|J) + H(BJ), \tag{27}$$

with respect to $\Omega$, because isometries do not alter entropies.

Now, writing out the above quantities as sums and differences of entropies, and using the fact
that $BJ$ and $B'$ are in a product state, a number of terms cancel out, and (27) becomes equivalent to

$$H(BJB'|X) \geq H(B'|XJ).$$
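For the reader's convenience, the cancellation can be spelled out. The only inputs are the definitions of the (conditional) mutual information and the product structure between $BJ$ and $B'$, which gives $H(BJB') = H(BJ) + H(B')$ and $H(B'|J) = H(B')$; all entropies refer to the same state.

```latex
\begin{align*}
I(X;BJB') &= H(BJB') - H(BJB'|X) = H(BJ) + H(B') - H(BJB'|X), \\
I(X;B'|J) &= H(B'|J) - H(B'|XJ) = H(B') - H(B'|XJ), \\
\text{so (27)} \;\Longleftrightarrow\;
  & H(BJ) + H(B') - H(BJB'|X) \;\le\; H(B') - H(B'|XJ) + H(BJ) \\
\;\Longleftrightarrow\;
  & H(BJB'|X) \;\ge\; H(B'|XJ).
\end{align*}
```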

But now rewriting the left hand side, using $H(BJ|X) \geq 0$ (because it is an average of von Neumann
entropies), we estimate:

$$\begin{aligned}
H(BJB'|X) &= H(B'|BJX) + H(BJ|X) \\
&\geq H(B'|BJX) \\
&\geq H(B'|JX),
\end{aligned}$$

where in the last line we have used strong subadditivity, and we are done. □

We do not know if additivity as in the above cases holds universally, but we regard our results
as evidence in favor of this conjecture.

Returning to finite side-communication, it is a most interesting question whether a similar
single-letterization can be performed. We do not know if an additivity formula, similar to the one
in lemma 3 for classical-quantum correlation, holds for the rate function $D^{(1)}(\rho \otimes \sigma, R)$. In fact,
this seems unlikely because its definition does not even allow one to see that it is concave in $R$
(which it would have to be, were it equal to the regularized quantity). Of course this can easily be
remedied by going to the concave hull of $D^{(1)}$: note that both regularize to the same function
for $L \to \infty$. However, we were still unable to prove additivity for this concave hull. This would be a most
desirable property, as it would allow single-letterization of the rate function just as in the case
of classical-quantum correlations. As it stands, $D^{(1)}(\rho, R)$ is the CR obtainable from $\rho$ in excess
over $R$, if (one-way) side communication is limited to $R$ and if the initial measurement is a tensor
product.

4 Discussion 

We have introduced the task of distilling common randomness from a quantum state by limited
classical one-way communication, placing it in the context of general resource conversion problems
from classical and quantum information theory. Our exposition can be read as a systematic
objective for the field of quantum information theory: to study all the conceivable inter-conversion
problems between the resources enumerated in the Introduction.

Our main result is the characterization of the optimal asymptotically distillable common
randomness $C$ (as a function of the communication bound $R$); in the case of initial classical-quantum
correlations this characterization is a single-letter optimization.

A particularly interesting figure is the total "distillable common randomness", which is the
supremum of $C(R) - R$ as $R \to \infty$: for the classical-quantum correlations it turns out to be simply
the quantum mutual information, and in general it is identical to the regularized version of the
measure for classical correlation put forward by Henderson and Vedral [21].

It should be noted that this quantity is generally smaller than the quantum mutual information 
I (A; B) of the state p AB (which was discussed in but larger than the quantity proposed by 

Levitin |2flj . Interestingly, while the former work simply examines a quantity defined in formal 
analogy to classical mutual information for its usefulness to (at least, qualitatively) describe
quantum phenomena, the latter motivates the definition by resorting to operational arguments. Of
course, all this shows is that there can be several operational approaches to the same intuitive
concept: quantities thus defined might coincide for classical systems but differ in the quantum
version.

This is what we see even within the realm of our definitions. In the classical theory jz] the 
total distillable CR equals the mutual information of the initial distribution, regardless of the 
particulars of the noiseless side communication: whether it is one-way from Alice to Bob or vice 
versa, or actually bidirectional, the answer is the mutual information. There are simple examples 
of quantum states where the total distillable common randomness depends on the communication 
model: the classical-quantum correlation associated with an ensemble $\mathcal{E} = \{\rho_x, p(x)\}$ of states at
Bob's side (compare eq. Q) leads to $I(A;Q) = \chi(\mathcal{E})$ if one-way communication from Alice to
Bob is available. If only one-way communication from Bob to Alice is available, it is only $I_{acc}(\mathcal{E})$,
the accessible information of the ensemble $\mathcal{E}$, which usually is strictly smaller than the Holevo
information $\chi(\mathcal{E})$.

An open problem left in this work is to decide the additivity questions in section 3: is the
distillable common randomness $D^{(1)}(\rho, \infty)$ additive in general? Does the rate function $D^{(1)}(\rho, R)$
obey an additivity formula like the one in lemma 3? Finally, there is the issue of finding the
"ultimate" distillable common randomness involving two-way communication.

Acknowledgments We thank C. H. Bennett, D. P. DiVincenzo, B. M. Terhal, J. A. Smolin 
and R. Abbot for useful discussions. ID's work was supported in part by the NSA under the US 
Army Research Office (ARO), grant numbers DAAG55-98-C-0041 and DAAD19-01-1-06. AW is 
supported by the U.K. Engineering and Physical Sciences Research Council. 

A Appendix 

We shall list definitions and properties of typical sequences and subspaces OEDHUZI- Consider the 
classical-quantum system $UXQ$ in the state defined by the ensemble $\{p(u,x), \rho_{ux}\}$. $X$ is defined
on the set $\mathcal{X}$ of cardinality $s_1$ and $U$ on the set $\mathcal{U}$ of cardinality $s_2$. Denote by $p(x)$ and $P(x|u)$
the distribution of $X$ and conditional distribution of $X|U$ respectively.

For the probability distribution $p$ on the set $\mathcal{X}$ define the set of typical sequences (with $\delta > 0$)

$$\mathcal{T}^n_{p,\delta} = \{x^n : \forall x\ \ |N(x|x^n) - np(x)| \leq n\delta\},$$

where $N(x|x^n)$ counts the number of occurrences of $x$ in the word $x^n = x_1 \ldots x_n$ of length $n$.
When the distribution $p$ is associated with some random variable $X$ we may use the notation $\mathcal{T}^n_{X,\delta}$.

For the stochastic matrix $P: \mathcal{U} \to \mathcal{X}$ and $u^n \in \mathcal{U}^n$ define the set of conditionally typical
sequences (with $\delta > 0$) by

$$\mathcal{T}^n_{P,\delta}(u^n) = \{x^n : \forall u, x\ \ |N((u,x)|(u^n,x^n)) - P(x|u)N(u|u^n)| \leq n\delta\}.$$

When the stochastic matrix $P$ is associated with some conditional random variable $X|U$ we may
use the notation $\mathcal{T}^n_{X|U,\delta}(u^n)$.
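Both membership tests are simple counting conditions, as the following sketch shows. The function names and the dictionary-based representation of $p$ and $P$ are illustrative assumptions; the inequalities are read as non-strict ($\leq n\delta$), and the words are assumed to use only letters listed in the given distributions.

```python
from collections import Counter


def is_typical(xn, p, delta):
    """True if |N(x|x^n) - n p(x)| <= n*delta for every letter x in p."""
    n = len(xn)
    counts = Counter(xn)  # missing letters count as 0
    return all(abs(counts[x] - n * px) <= n * delta for x, px in p.items())


def is_conditionally_typical(xn, un, P, delta):
    """True if |N((u,x)|(u^n,x^n)) - P(x|u) N(u|u^n)| <= n*delta for all u, x.

    P maps each u to a dict {x: P(x|u)}.
    """
    n = len(xn)
    pair_counts = Counter(zip(un, xn))
    u_counts = Counter(un)
    return all(abs(pair_counts[(u, x)] - pxu * u_counts[u]) <= n * delta
               for u, row in P.items() for x, pxu in row.items())


uniform = {"0": 0.5, "1": 0.5}
noiseless = {"a": {"0": 1.0, "1": 0.0}, "b": {"0": 0.0, "1": 1.0}}
print(is_typical("0011", uniform, 0.1))
print(is_conditionally_typical("0011", "aabb", noiseless, 0.1))
```

Note that the deviation is always measured against $n\delta$, with $n$ the full word length, also in the conditional case.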

For a density operator $\rho$ on a $d$-dimensional Hilbert space $\mathcal{H}$, with eigen-decomposition $\rho =
\sum_{k=1}^d \lambda_k |k\rangle\langle k|$, define (for $\delta > 0$) the typical projector as

$$\Pi^n_{\rho,\delta} = \sum_{k^n \in \mathcal{T}^n_{\lambda,\delta}} |k^n\rangle\langle k^n|.$$

When the density operator $\rho$ is associated with some quantum system $Q$ we may use the notation
$\Pi^n_{Q,\delta}$. For a collection of states $\rho_u$, $u \in \mathcal{U}$, and $u^n \in \mathcal{U}^n$ define the conditionally typical projector as

$$\Pi^n_{\{\rho_u\},\delta}(u^n) = \bigotimes_u \Pi^{I_u}_{\rho_u,\delta},$$

where $I_u = \{i : u_i = u\}$ and $\Pi^{I_u}_{\rho_u,\delta}$ denotes the typical projector of the density operator $\rho_u$ in the
positions given by the set $I_u$ in the tensor product of $n$ factors. When the $\{\rho_u\}$ are associated with
some conditional classical-quantum system $Q|U$ we may use the notation $\Pi^n_{Q|U,\delta}(u^n)$. We
shall give several known properties of these projectors, some of which are used in the main part
of the paper. For any positive $\epsilon$, $\delta$ and $\delta'$, some constant $c$ depending on the particular ensemble
of $UXQ$, and for sufficiently large $n \geq n_0(\epsilon, \delta, \delta')$, the following hold. Concerning the quantum
system $Q$ alone:

$$\mathrm{Tr}\,\Pi^n_{Q,\delta} \leq 2^{n(H(Q) + c\delta)},$$
$$\mathrm{Tr}\,\rho^{\otimes n}\Pi^n_{Q,\delta} \geq 1 - \epsilon.$$
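These two properties can be illustrated numerically for a single qubit. The sketch assumes the typical projector projects onto the eigenvectors $|k^n\rangle$ whose index sequence is typical for the eigenvalue distribution (with a non-strict inequality), so that its rank and weight reduce to binomial sums; the function name and parameter values are illustrative.

```python
import math


def qubit_typical_projector_stats(lam, n, delta):
    """Return (Tr Pi, Tr rho^{(x)n} Pi) for a qubit state with eigenvalues
    lam and 1 - lam, where Pi is the typical projector for (n, delta)."""
    rank, weight = 0, 0.0
    for k in range(n + 1):  # k = number of positions carrying eigenvalue lam
        # For a binary alphabet both typicality conditions reduce to this one.
        if abs(k - n * lam) <= n * delta:
            multiplicity = math.comb(n, k)
            rank += multiplicity
            weight += multiplicity * lam**k * (1 - lam)**(n - k)
    return rank, weight


n = 200
rank, weight = qubit_typical_projector_stats(0.25, n, 0.05)
rate = math.log2(rank) / n  # to be compared with H(Q) + c*delta
print(rate, weight)
```

For these parameters the rate exceeds $H(Q) = h_2(0.25) \approx 0.81$ only by a term of order $\delta$, while the weight is already well above $\frac{1}{2}$ and approaches 1 as $n$ grows.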

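Concerning the classical-quantum system $XQ$, and for $x^n \in \mathcal{T}^n_{X,\delta'}$:

$$\mathrm{Tr}\,\Pi^n_{Q|X,\delta}(x^n) \leq 2^{n(H(Q|X) + c(\delta + \delta'))}, \tag{28}$$
$$\mathrm{Tr}\,\rho_{x^n}\Pi^n_{Q|X,\delta}(x^n) \geq 1 - \epsilon,$$
$$\mathrm{Tr}\,\rho_{x^n}\Pi^n_{Q,\delta + |\mathcal{X}|\delta'} \geq 1 - \epsilon. \tag{29}$$

These have been proven in |3fi|. Finally, concerning the full classical-quantum system $UXQ$,
for $x^n \in \mathcal{T}^n_{X|U,\delta'}(u^n)$, (29) easily extends to

$$\mathrm{Tr}\,\rho_{u^n x^n}\Pi^n_{Q|U,\delta + |\mathcal{X}|\delta'}(u^n) \geq 1 - \epsilon. \tag{30}$$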